Global Multi-label Feature Selection Driven by Higher-Order Correlation and Dual Redundancy

doi:10.16451/j.cnki.issn1003-6059.202601003



Abstract: Multi-label feature selection is a key preprocessing technique for high-dimensional multi-label data. Existing methods either tend to get trapped in local optima because they rely on greedy search strategies, or measure feature relevance and redundancy inadequately within sparse models. To address these issues, a global multi-label feature selection algorithm driven by higher-order correlation and dual redundancy (GHC-DR) is proposed. First, a fuzzy dependency measure based on multi-label k-nearest neighbors is introduced to accurately evaluate the higher-order correlation between features and the label system. Second, focusing on the local geometric structure of features, a feature graph is constructed to capture local similarities among features, and a dual redundancy evaluation mechanism that fuses information theory with local structure is designed. Finally, higher-order correlation, dual redundancy, and label correlation are integrated into a unified sparse learning objective function, and an efficient closed-form solution is derived. Comparative experiments on 15 public multi-label benchmark datasets show that GHC-DR achieves a performance advantage on multiple evaluation metrics.
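The abstract does not give the GHC-DR objective itself, so the following is only a generic illustration of two ingredients it names: a feature graph that captures local similarity among features, and a regularized learning objective with a closed-form solution. It is a minimal sketch under stated assumptions, not the authors' method. The function names, the use of absolute Pearson correlation as the similarity, the Frobenius-norm penalty (used here instead of a sparsity-inducing norm so that a one-step closed form exists), and the parameters `alpha`, `beta`, and `k` are all hypothetical.

```python
import numpy as np


def feature_knn_graph(X, k=3):
    """Symmetric kNN similarity graph over the features (columns) of X.

    Assumption: similarity is the absolute Pearson correlation between features.
    """
    Xc = X - X.mean(axis=0)
    norms = np.linalg.norm(Xc, axis=0)
    norms[norms == 0] = 1.0                      # guard against constant features
    Xn = Xc / norms
    C = np.abs(Xn.T @ Xn)                        # d x d feature similarity
    np.fill_diagonal(C, 0.0)                     # no self-loops
    d = C.shape[0]
    idx = np.argsort(-C, axis=1)[:, :k]          # k most similar features per row
    rows = np.repeat(np.arange(d), k)
    S = np.zeros_like(C)
    S[rows, idx.ravel()] = C[rows, idx.ravel()]
    return np.maximum(S, S.T)                    # symmetrize


def rank_features(X, Y, alpha=1.0, beta=1.0, k=3):
    """Graph-regularized least squares (an illustrative stand-in objective):

        min_W ||XW - Y||_F^2 + alpha * tr(W^T L W) + beta * ||W||_F^2

    Setting the gradient to zero gives (X^T X + alpha L + beta I) W = X^T Y.
    """
    S = feature_knn_graph(X, k)
    L = np.diag(S.sum(axis=1)) - S               # graph Laplacian of the feature graph
    d = X.shape[1]
    A = X.T @ X + alpha * L + beta * np.eye(d)   # positive definite for beta > 0
    W = np.linalg.solve(A, X.T @ Y)              # closed-form solution
    scores = np.linalg.norm(W, axis=1)           # row norm as feature importance
    return np.argsort(-scores), W
```

Here the Laplacian term only smooths the weights of similar features; it does not model the fuzzy dependency, the dual redundancy evaluation, or the label correlation described in the abstract. Features are ranked by the row norms of `W`, a common convention in regression-based feature selection.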

Key words: Fuzzy Rough Sets, Feature Selection, Multi-label Learning, Feature Redundancy

Received: 2025-12-23

CLC number: TP18

Funding: Supported by the National Natural Science Foundation of China (No. 12471442)

Corresponding author: SHE Yanhong, Ph.D., professor. Research interests: machine learning and uncertain data modeling. E-mail: yanhongshe@xsyu.edu.cn.

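About the authors: DENG Wen, master's student. Research interests: fuzzy rough sets and feature selection. E-mail: dw18866@163.com.
ZHENG Wenli, Ph.D., lecturer. Research interests: machine learning and hierarchical classification. E-mail: wlzheng@xsyu.edu.cn.
HE Xiaoli, Ph.D., associate professor. Research interests: uncertainty reasoning and granular computing. E-mail: hexl@xsyu.edu.cn.
QIAN Ting, Ph.D., associate professor. Research interests: rough sets, concept lattices, and uncertainty reasoning. E-mail: qiant2000@126.com.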

Cite this article:

DENG Wen, SHE Yanhong, ZHENG Wenli, HE Xiaoli, QIAN Ting. Global Multi-label Feature Selection Driven by Higher-Order Correlation and Dual Redundancy. Pattern Recognition and Artificial Intelligence, 2026, 39(1): 52-66.

Link to this article:

http://manu46.magtech.com.cn/Jweb_prai/CN/10.16451/j.cnki.issn1003-6059.202601003 or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2026/V39/I1/52
